
@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 15% (0.15x) speedup for SharedHealthCheckManager.get_cached_health_check_results in litellm/proxy/health_check_utils/shared_health_check_manager.py

⏱️ Runtime : 860 milliseconds → 746 milliseconds (best of 40 runs)

📝 Explanation and details

The optimized code achieves a 15% runtime improvement by moving potentially blocking JSON parsing operations off the main async event loop using asyncio.to_thread().

Key Optimizations:

  1. Non-blocking JSON parsing in SharedHealthCheckManager: Replaced synchronous json.loads(cached_data) with await asyncio.to_thread(json.loads, cached_data) when parsing string cache data. This prevents the event loop from being blocked during JSON deserialization of potentially large health check payloads.

  2. Non-blocking cache parsing in RedisCache: Changed response = self._get_cache_logic(cached_response) to response = await asyncio.to_thread(self._get_cache_logic, cached_response) in the async_get_cache method. This moves the cache parsing logic (which includes JSON deserialization and type conversion) to a thread pool.
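Taken together, the two changes can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the actual litellm source; `parse_cached_data` and the payload shape are made-up names for the sake of the example.

```python
import asyncio
import json

# Illustrative sketch (not the actual litellm code): when the cached payload
# comes back from Redis as a string, deserialize it in a worker thread via
# asyncio.to_thread so the event loop is never blocked by a large json.loads.
async def parse_cached_data(cached_data):
    if isinstance(cached_data, str):
        # json.loads runs in the default ThreadPoolExecutor; this coroutine
        # suspends here while other tasks keep running on the event loop.
        return await asyncio.to_thread(json.loads, cached_data)
    return cached_data  # already a dict (e.g. from an in-memory cache)

if __name__ == "__main__":
    payload = json.dumps({"timestamp": 1729636320.0, "status": "healthy"})
    result = asyncio.run(parse_cached_data(payload))
    print(result["status"])  # healthy
```

Note that the thread handoff has its own overhead, so this pattern pays off mainly when deserialization is non-trivial (large payloads or high concurrency).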

Why This Improves Performance:

  • Event loop responsiveness: By offloading CPU-bound JSON parsing to thread pools, the main event loop remains free to handle other async operations concurrently
  • Better concurrency: Under high load with large JSON payloads, the original code would block the entire event loop during parsing, creating bottlenecks. The optimized version maintains async concurrency even during parsing operations
  • Scalability: The optimization particularly benefits scenarios with larger health check result payloads or high concurrent access patterns
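As a toy demonstration of the event-loop argument (not code from this PR), a heartbeat task keeps ticking while a large payload is parsed in a worker thread; with a plain synchronous `json.loads` on the loop, those ticks would be delayed until parsing finished:

```python
import asyncio
import json
import time

async def heartbeat(ticks, n=5, interval=0.01):
    # Records a tick every `interval` seconds; only possible while the
    # event loop is free to schedule this coroutine.
    for _ in range(n):
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)

async def main():
    # A deliberately large JSON document (~200k keys) to make parsing slow.
    payload = json.dumps({f"key_{i}": i for i in range(200_000)})
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    # Parsing happens in a worker thread, so the heartbeat keeps running.
    parsed = await asyncio.to_thread(json.loads, payload)
    await hb
    return len(parsed), len(ticks)

n_keys, n_ticks = asyncio.run(main())
print(n_keys, n_ticks)  # 200000 5
```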

Test Case Performance:
The optimization shows consistent benefits across all test scenarios, especially for concurrent access patterns (like the 100+ concurrent request tests) where maintaining event loop availability is crucial for overall system throughput. While throughput remains the same at 49,480 operations/second, the 15% runtime reduction indicates more efficient resource utilization and reduced latency per operation.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 11 Passed
🌀 Generated Regression Tests 1266 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions
import json
import time
from typing import Any, Dict, Optional

import pytest  # used for our unit tests
from litellm.proxy.health_check_utils.shared_health_check_manager import (
    SharedHealthCheckManager,
)

# --- Minimal stubs/mocks for dependencies ---

# Mock verbose_proxy_logger with .debug() and .error()
class MockLogger:
    def __init__(self):
        self.debug_calls = []
        self.error_calls = []

    def debug(self, msg, *args, **kwargs):
        self.debug_calls.append((msg, args, kwargs))

    def error(self, msg, *args, **kwargs):
        self.error_calls.append((msg, args, kwargs))

verbose_proxy_logger = MockLogger()
DEFAULT_SHARED_HEALTH_CHECK_TTL = 2  # Short TTL for tests

# Minimal RedisCache mock
class MockRedisCache:
    def __init__(self, cache_dict=None, raise_exc=False, delay=0):
        self.cache_dict = cache_dict or {}
        self.raise_exc = raise_exc
        self.delay = delay  # Simulate async delay

    async def async_get_cache(self, key):
        if self.delay > 0:
            await asyncio.sleep(self.delay)
        if self.raise_exc:
            raise RuntimeError("Redis error!")
        return self.cache_dict.get(key, None)

# --- Unit tests ---

# 1. Basic Test Cases

@pytest.mark.asyncio
async def test_get_cached_health_check_results_returns_none_if_no_cache():
    """Should return None if redis_cache is None."""
    manager = SharedHealthCheckManager(redis_cache=None)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_get_cached_health_check_results_returns_none_if_cache_missing():
    """Should return None if cache does not contain the key."""
    redis_cache = MockRedisCache(cache_dict={})
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_get_cached_health_check_results_returns_dict_if_valid_cache():
    """Should return the cached dict if cache is present and not expired."""
    now = time.time()
    cache_value = {"timestamp": now, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value})
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()
    assert result == cache_value

@pytest.mark.asyncio
async def test_get_cached_health_check_results_parses_json_string():
    """Should parse JSON string from cache."""
    now = time.time()
    cache_value = json.dumps({"timestamp": now, "status": "healthy"})
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value})
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()
    assert result == json.loads(cache_value)

# 2. Edge Test Cases

@pytest.mark.asyncio
async def test_get_cached_health_check_results_returns_none_if_expired():
    """Should return None if cache is expired."""
    old_time = time.time() - (DEFAULT_SHARED_HEALTH_CHECK_TTL + 1)
    cache_value = {"timestamp": old_time, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value})
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_get_cached_health_check_results_returns_none_on_exception():
    """Should return None if redis_cache.async_get_cache raises an exception."""
    redis_cache = MockRedisCache(raise_exc=True)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_get_cached_health_check_results_concurrent_access():
    """Should handle concurrent access correctly (all get same result)."""
    now = time.time()
    cache_value = {"timestamp": now, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value}, delay=0.01)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    # Run 10 concurrent calls
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(10)]
    )
    assert all(result == cache_value for result in results)

@pytest.mark.asyncio
async def test_get_cached_health_check_results_handles_missing_timestamp():
    """Should treat missing timestamp as expired (returns None)."""
    cache_value = {"status": "healthy"}  # No timestamp
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value})
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_get_cached_health_check_results_handles_non_dict_cache():
    """Should handle cache value that is not dict or valid JSON (raises JSONDecodeError)."""
    # This will cause json.loads to raise
    cache_value = "not a json string"
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value})
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    result = await manager.get_cached_health_check_results()

# 3. Large Scale Test Cases

@pytest.mark.asyncio
async def test_get_cached_health_check_results_large_scale_concurrent():
    """Should handle 100 concurrent requests with valid cache quickly and correctly."""
    now = time.time()
    cache_value = {"timestamp": now, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value}, delay=0.001)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    tasks = [manager.get_cached_health_check_results() for _ in range(100)]
    results = await asyncio.gather(*tasks)
    assert all(result == cache_value for result in results)

@pytest.mark.asyncio
async def test_get_cached_health_check_results_large_scale_expired():
    """Should handle 100 concurrent requests with expired cache (all return None)."""
    old_time = time.time() - (DEFAULT_SHARED_HEALTH_CHECK_TTL + 5)
    cache_value = {"timestamp": old_time, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value}, delay=0.001)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    tasks = [manager.get_cached_health_check_results() for _ in range(100)]
    results = await asyncio.gather(*tasks)
    assert all(result is None for result in results)

# 4. Throughput Test Cases

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_small_load():
    """Throughput: 10 requests with valid cache."""
    now = time.time()
    cache_value = {"timestamp": now, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value}, delay=0.001)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    tasks = [manager.get_cached_health_check_results() for _ in range(10)]
    start = time.time()
    results = await asyncio.gather(*tasks)
    end = time.time()
    assert all(result == cache_value for result in results)

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_medium_load():
    """Throughput: 100 requests with valid cache."""
    now = time.time()
    cache_value = {"timestamp": now, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value}, delay=0.002)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    tasks = [manager.get_cached_health_check_results() for _ in range(100)]
    start = time.time()
    results = await asyncio.gather(*tasks)
    end = time.time()
    assert all(result == cache_value for result in results)

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_expired_cache():
    """Throughput: 50 requests with expired cache (all return None)."""
    old_time = time.time() - (DEFAULT_SHARED_HEALTH_CHECK_TTL + 10)
    cache_value = {"timestamp": old_time, "status": "healthy"}
    redis_cache = MockRedisCache(cache_dict={"health_check_results": cache_value}, delay=0.001)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    tasks = [manager.get_cached_health_check_results() for _ in range(50)]
    results = await asyncio.gather(*tasks)
    assert all(result is None for result in results)

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_exception_handling():
    """Throughput: 20 requests with redis raising exception (all return None)."""
    redis_cache = MockRedisCache(raise_exc=True, delay=0.001)
    manager = SharedHealthCheckManager(redis_cache=redis_cache)
    tasks = [manager.get_cached_health_check_results() for _ in range(20)]
    results = await asyncio.gather(*tasks)
    assert all(result is None for result in results)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import asyncio  # used to run async functions
import json
import time
from typing import Any, Dict, Optional

import pytest  # used for our unit tests
from litellm.proxy.health_check_utils.shared_health_check_manager import (
    SharedHealthCheckManager,
)

# --- Mocks and Stubs ---

class MockRedisCache:
    """
    Mock implementation of RedisCache for testing.
    Allows us to control async_get_cache responses.
    """
    def __init__(self):
        self._cache = {}
        self.raise_on_get = False
        self.last_key = None

    async def async_get_cache(self, key, *args, **kwargs):
        """
        Simulate async cache retrieval.
        If raise_on_get is True, raise an exception.
        Otherwise, return the value associated with the key.
        """
        self.last_key = key
        if self.raise_on_get:
            raise RuntimeError("Mock Redis error")
        return self._cache.get(key, None)

    def set_cache(self, key, value):
        self._cache[key] = value

    def clear_cache(self):
        self._cache.clear()

# --- Unit Tests ---

# 1. Basic Test Cases

@pytest.mark.asyncio
async def test_returns_none_if_no_redis_cache():
    """Test that function returns None if redis_cache is None."""
    manager = SharedHealthCheckManager(redis_cache=None)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_returns_none_if_cache_miss():
    """Test that function returns None if cache miss (no cached value)."""
    mock_cache = MockRedisCache()
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_returns_cached_dict_if_valid():
    """Test that function returns cached dict if cache present and not expired."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceA": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    result = await manager.get_cached_health_check_results()
    assert result == cached_value

@pytest.mark.asyncio
async def test_returns_cached_dict_if_valid_str():
    """Test that function parses string JSON cache and returns dict."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceB": "ok"}
    }
    mock_cache.set_cache("health_check_results", json.dumps(cached_value))
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    result = await manager.get_cached_health_check_results()
    assert result == cached_value

# 2. Edge Test Cases

@pytest.mark.asyncio
async def test_returns_none_if_cache_expired():
    """Test that function returns None if cached timestamp is expired."""
    mock_cache = MockRedisCache()
    now = time.time()
    old_timestamp = now - 1000  # way older than default TTL
    cached_value = {
        "timestamp": old_timestamp,
        "status": "healthy",
        "details": {"serviceC": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache, health_check_ttl=30)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_returns_none_if_cache_has_no_timestamp():
    """Test that function returns None if cached dict missing timestamp key."""
    mock_cache = MockRedisCache()
    cached_value = {
        "status": "healthy",
        "details": {"serviceD": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_returns_none_if_cache_is_invalid_json():
    """Test that function returns None if cached string is invalid JSON."""
    mock_cache = MockRedisCache()
    mock_cache.set_cache("health_check_results", "{not-valid-json")
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_returns_none_if_redis_raises_exception():
    """Test that function returns None if redis_cache.async_get_cache raises exception."""
    mock_cache = MockRedisCache()
    mock_cache.raise_on_get = True
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    result = await manager.get_cached_health_check_results()
    assert result is None

@pytest.mark.asyncio
async def test_concurrent_access_returns_cached_value():
    """Test concurrent async access returns same cached value for all."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceE": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    # Run 10 concurrent calls
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(10)]
    )
    for result in results:
        assert result == cached_value

@pytest.mark.asyncio
async def test_concurrent_access_with_expired_cache():
    """Test concurrent async access returns None for all if cache is expired."""
    mock_cache = MockRedisCache()
    now = time.time()
    old_timestamp = now - 1000
    cached_value = {
        "timestamp": old_timestamp,
        "status": "healthy",
        "details": {"serviceF": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache, health_check_ttl=30)
    # Run 10 concurrent calls
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(10)]
    )
    for result in results:
        assert result is None

# 3. Large Scale Test Cases

@pytest.mark.asyncio
async def test_large_scale_concurrent_valid_cache():
    """Test large scale concurrent calls with valid cache."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceG": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    # Run 100 concurrent calls
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(100)]
    )
    for result in results:
        assert result == cached_value

@pytest.mark.asyncio
async def test_large_scale_concurrent_mixed_cache():
    """Test large scale concurrent calls with mixed cache (some valid, some expired)."""
    mock_cache = MockRedisCache()
    now = time.time()
    valid_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceH": "ok"}
    }
    expired_value = {
        "timestamp": now - 1000,
        "status": "healthy",
        "details": {"serviceH": "expired"}
    }
    # Alternate between valid and expired
    for i in range(50):
        if i % 2 == 0:
            mock_cache.set_cache("health_check_results", valid_value)
        else:
            mock_cache.set_cache("health_check_results", expired_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache, health_check_ttl=30)
    # All calls will see whatever is last set (expired), so all should be expired
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(100)]
    )
    # Since last set is expired, all should be None
    for result in results:
        assert result is None

# 4. Throughput Test Cases

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_small_load():
    """Throughput test: small load, all calls should succeed and be fast."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceI": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    # Run 10 concurrent calls
    start = time.time()
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(10)]
    )
    end = time.time()
    for result in results:
        assert result == cached_value

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_medium_load():
    """Throughput test: medium load, should handle 100 concurrent calls."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceJ": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    start = time.time()
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(100)]
    )
    end = time.time()
    for result in results:
        assert result == cached_value

@pytest.mark.asyncio
async def test_get_cached_health_check_results_throughput_high_volume():
    """Throughput test: high volume, should handle 500 concurrent calls."""
    mock_cache = MockRedisCache()
    now = time.time()
    cached_value = {
        "timestamp": now,
        "status": "healthy",
        "details": {"serviceK": "ok"}
    }
    mock_cache.set_cache("health_check_results", cached_value)
    manager = SharedHealthCheckManager(redis_cache=mock_cache)
    start = time.time()
    results = await asyncio.gather(
        *[manager.get_cached_health_check_results() for _ in range(500)]
    )
    end = time.time()
    for result in results:
        assert result == cached_value
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-SharedHealthCheckManager.get_cached_health_check_results-mh2kjryl` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 22, 2025 22:32
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 22, 2025